Goto

Collaborating Authors

 change rate




Evidence for Limited Metacognition in LLMs

Ackerman, Christopher

arXiv.org Artificial Intelligence

The possibility of LLM self-awareness and even sentience is gaining increasing public attention and has major safety and policy implications, but the science of measuring them is still in a nascent state. Here we introduce a novel methodology for quantitatively evaluating metacognitive abilities in LLMs. Taking inspiration from research on metacognition in nonhuman animals, our approach eschews model self-reports and instead tests to what degree models can strategically deploy knowledge of internal states. Using two experimental paradigms, we demonstrate that frontier LLMs introduced since early 2024 show increasingly strong evidence of certain metacognitive abilities, specifically the ability to assess and utilize their own confidence in their ability to answer factual and reasoning questions correctly and the ability to anticipate what answers they would give and utilize that information appropriately. We buttress these behavioral findings with an analysis of the token probabilities returned by the models, which suggests the presence of an upstream internal signal that could provide the basis for metacognition. We further find that these abilities 1) are limited in resolution, 2) emerge in context-dependent manners, and 3) seem to be qualitatively different from those of humans. We also report intriguing differences across models of similar capabilities, suggesting that LLM post-training may have a role in developing metacognitive abilities.



Consensus Tracking Control of Multi-agent Systems with A Time-varying Reference State under Binary-valued Communication

Wang, Ting, Qiu, Zhuangzhuang, Lu, Xiaodong, Zhao, Yanlong

arXiv.org Artificial Intelligence

This paper investigates the problem of consensus tracking control of discrete time multi-agent systems under binary-valued communication. Different from most existing studies on consensus tracking, the transmitted information between agents is the binary-valued. Parameter identification with binary-valued observations is applied to the estimation of neighbors'states and the tracking control is designed based on the estimation. Two Lyapunov functions are constructed to deal with the strong coupling of estimation and control. Compared with consensus problems under binary-valued communication, a reference state is required for consensus tracking control. Two scenarios of the time-varying reference state are studied respectively. (1) The reference state is asymptotically convergent. An online algorithm that performs estimation and control simultaneously is proposed, in which the estimation step size and the control gain are decreasing with time. By this algorithm, the multi-agent system is proved to achieve consensus tracking with convergence rate O(1/k^{\epsilon} ) under certain conditions. (2) The reference state is bounded, which is less conservative than that in the first case. In this case, the estimation step size and control gain are designed to be constant. By this algorithm, all the followers can reach to a neighborhood of the leader with an exponential rate. Finally, simulations are given to demonstrate theoretical results.


Aneumo: A Large-Scale Comprehensive Synthetic Dataset of Aneurysm Hemodynamics

Li, Xigui, Zhou, Yuanye, Xiao, Feiyang, Guo, Xin, Zhang, Yichi, Jiang, Chen, Ge, Jianchao, Wang, Xiansheng, Wang, Qimeng, Zhang, Taiwei, Lin, Chensen, Cheng, Yuan, Qi, Yuan

arXiv.org Artificial Intelligence

Intracranial aneurysm (IA) is a common cerebrovascular disease that is usually asymptomatic but may cause severe subarachnoid hemorrhage (SAH) if ruptured. Although clinical practice is usually based on individual factors and morphological features of the aneurysm, its pathophysiology and hemodynamic mechanisms remain controversial. To address the limitations of current research, this study constructed a comprehensive hemodynamic dataset of intracranial aneurysms. The dataset is based on 466 real aneurysm models, and 10,000 synthetic models were generated by resection and deformation operations, including 466 aneurysm-free models and 9,534 deformed aneurysm models. The dataset also provides medical image-like segmentation mask files to support insightful analysis. In addition, the dataset contains hemodynamic data measured at eight steady-state flow rates (0.001 to 0.004 kg/s), including critical parameters such as flow velocity, pressure, and wall shear stress, providing a valuable resource for investigating aneurysm pathogenesis and clinical prediction. This dataset will help advance the understanding of the pathologic features and hemodynamic mechanisms of intracranial aneurysms and support in-depth research in related fields. Dataset hosted at https://github.com/Xigui-Li/Aneumo.


Is Smoothness the Key to Robustness? A Comparison of Attention and Convolution Models Using a Novel Metric

Chen, Baiyuan

arXiv.org Artificial Intelligence

Robustness is a critical aspect of machine learning models. Existing robustness evaluation approaches often lack theoretical generality or rely heavily on empirical assessments, limiting insights into the structural factors contributing to robustness. Moreover, theoretical robustness analysis is not applicable for direct comparisons between models. To address these challenges, we propose $\textit{TopoLip}$, a metric based on layer-wise analysis that bridges topological data analysis and Lipschitz continuity for robustness evaluation. TopoLip provides a unified framework for both theoretical and empirical robustness comparisons across different architectures or configurations, and it reveals how model parameters influence the robustness of models. Using TopoLip, we demonstrate that attention-based models typically exhibit smoother transformations and greater robustness compared to convolution-based models, as validated through theoretical analysis and adversarial tasks. Our findings establish a connection between architectural design, robustness, and topological properties.


AttnMod: Attention-Based New Art Styles

Su, Shih-Chieh

arXiv.org Artificial Intelligence

Imagine a human artist looking at the generated photo of a diffusion model, and hoping to create a painting out of it. There could be some feature of the object in the photo that the artist wants to emphasize, some color to disperse, some silhouette to twist, or some part of the scene to be materialized. These intentions can be viewed as the modification of the cross attention from the text prompt onto UNet, during the desoising diffusion. This work presents AttnMod, to modify attention for creating new unpromptable art styles out of existing diffusion models. The style-creating behavior is studied across different setups.


Market Reaction to News Flows in Supply Chain Networks

Inoue, Hiroyasu, Todo, Yasuyuki

arXiv.org Artificial Intelligence

This study examines whether positive news about firms increases their stock prices and, moreover, whether it increases stock prices of the firms' suppliers and customers, using a large sample of publicly listed firms across the world and another of Japanese listed firms. The level of positiveness of each news article is determined by FinBERT, a natural language processing model fine-tuned specifically for financial information. Supply chains of firms across the world are identified mostly by financial statements, while those of Japanese firms are taken from large-scale firm-level surveys. We find that positive news increases the change rate of stock prices of firms mentioned in the news before its disclosure, most likely because of diffusion of information through informal channels. Positive news also raises stock prices of the firms' suppliers and customers before its disclosure, confirming propagation of market values through supply chains. In addition, we generally find a larger post-news effect on stock prices of the mentioned firms and their suppliers and customers than the pre-news effect. The positive difference between the post- and pre-news effects can be considered as the net effect of the disclosure of positive news, controlling for informal information diffusion. However, the post-news effect on suppliers and customers in Japan is smaller than the pre-news effect, a result opposite to those from firms across the world. This notable result is possibly because supply chain links of Japanese firms are stronger than global supply chains while such knowledge is restricted to selected investors.


Unleashing the Power of Task-Specific Directions in Parameter Efficient Fine-tuning

Si, Chongjie, Shi, Zhiyi, Zhang, Shifan, Yang, Xiaokang, Pfister, Hanspeter, Shen, Wei

arXiv.org Artificial Intelligence

Large language models demonstrate impressive performance on downstream tasks, yet requiring extensive resource consumption when fully fine-tuning all parameters. To mitigate this, Parameter Efficient Fine-Tuning (PEFT) strategies, such as LoRA, have been developed. In this paper, we delve into the concept of task-specific directions--critical for transitioning large models from pre-trained states to task-specific enhancements in PEFT. We propose a framework to clearly define these directions and explore their properties, and practical utilization challenges. We then introduce a novel approach, LoRA-Dash, which aims to maximize the impact of task-specific directions during the fine-tuning process, thereby enhancing model performance on targeted tasks. Extensive experiments have conclusively demonstrated the effectiveness of LoRA-Dash, and in-depth analyses further reveal the underlying mechanisms of LoRA-Dash. The code is available at https://github.com/Chongjie-Si/Subspace-Tuning.